GitHub repository: 605project
Selected dataset: A stock price dataset from Kaggle

Goal and Motivation

Data description

The dataset is about 13 GB in total and covers 100 stocks (the Nifty 100) and 2 indices (Nifty 50 and Nifty Bank). Data for each stock is stored in a separate CSV file. The dataset matches our interests and, because each stock has its own file, it is also well suited to parallel computation.
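
As a rough illustration of why the one-file-per-stock layout suits parallel computation, the sketch below summarizes each CSV file as an independent task. It is only a sketch under assumptions: the function names `summarize_stock` and `summarize_all`, the column name `close`, and the use of the file name as the ticker are all hypothetical and may not match the actual Kaggle files.

```python
import csv
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def summarize_stock(csv_path, close_col="close"):
    """Return (ticker, row_count, mean_close) for one per-stock CSV file.

    Assumption: the file has a header row with a numeric column named
    `close_col`; the real column name in the dataset may differ.
    """
    total = 0.0
    count = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            total += float(row[close_col])
            count += 1
    mean_close = total / count if count else float("nan")
    # Assumption: the file name (without extension) is the ticker, e.g. ACC.csv
    return Path(csv_path).stem, count, mean_close


def summarize_all(data_dir, max_workers=4):
    """Summarize every CSV in data_dir, one independent task per file."""
    paths = sorted(Path(data_dir).glob("*.csv"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `paths` in its results
        return list(pool.map(summarize_stock, paths))
```

A thread pool is used here only to keep the sketch simple. For CPU-heavy per-file work, `ProcessPoolExecutor` from the same module offers the same `map` interface, and the same per-file function could equally be submitted as separate jobs on a cluster.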

EDA

Since the 100 CSV files share almost the same structure, we use a single stock (ACC) as an example.

Historical view of the closing price

Sales volume

Daily return
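
One common way to compute a daily return is the simple percentage change between consecutive closing prices, r_t = (P_t - P_{t-1}) / P_{t-1}. The sketch below assumes that definition and assumes the input holds one closing price per trading day in chronological order; the function name `daily_returns` and the sample prices are hypothetical and are not taken from the ACC data.

```python
def daily_returns(closes):
    """Simple daily returns r_t = (P_t - P_{t-1}) / P_{t-1}.

    Assumption: `closes` is a chronologically ordered list with one
    closing price per trading day. The result has len(closes) - 1 entries.
    """
    return [(curr - prev) / prev for prev, curr in zip(closes, closes[1:])]


# Made-up prices for illustration only
example_closes = [100.0, 110.0, 99.0]
example_returns = daily_returns(example_closes)
```

If the raw files hold more than one record per day, the prices would first need to be reduced to a single daily close before applying this function.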

Predicting ACC's closing price